- __
- /_/\__
- \ \ \ |\
- ____\ \ \||__
- / / \\ \ __/
- | | /\ \\ \ \
- | | \/ \\_\/
- ____\ \ /\ |
- / / \\ \ \/ |
- | | /\/_\_\__/
- | | \/ \
- ____\ \ /\ |
- / / \\ \ \/ | ASP68K PROJECT
- /_/ /\ |\_\__/
- \ \ \/ | Sixth Edition
- ____\ \ /
- / / \\ \ \ by Michael Glew
- | | /\/_\ \ \ mglew@laurel.ocs.mq.edu.au
- | | \/ \\_\/ Technophilia BBS +61-2-8073563
- ____\_\__/\ |
- / / \ /_/\/ | January 1994
- | | /\ \\_\__/
- | | \/ \
- \ \ /\ \
- \ \ \_\/
- \ \ \
- \_\/
-
-
- ---------------------------------------------------------------------------
- C O N T R I B U T O R S
- ---------------------------------------------------------------------------
-
-
- Erik Bakke, Robert Barton, Bernd Blank, Kasimir Blomstedt, Frans Bouma,
- David Carson, Nicolas Dade, Aaron Digulla, Irmen de Jong, Andy Duplain,
- Denis Duplan, Steven Eker, Calle Englund, Alexander Fritsch, Charlie Gibbs,
- Kurt Haenen, Jon Hudson, Kjetil Jacobsen, Olav Kalgraf, Makoto Kamada,
- Markku Kolkka, John Lane, Jonathan Mahaffy, Dave Mc Mahan, Lindsay Meek,
- Walter Misar, Boerge Noest, Gunnar Rønning, Jay Scott, Olaf Seibert,
- Peter Simons.
-
-
- ---------------------------------------------------------------------------
- I N T R O D U C T I O N
- ---------------------------------------------------------------------------
-
-
- A while back, I was quite interested to find that there was an electronic
- magazine called "howtocode" that included lots of interesting hints and
- tips on coding. In the fifth edition, there was a list of optimizations
- that really got me thinking. "What if there was a proggy that you could
- put an assembler program through, that would speed it up, taking out all
- the stupid things output by compilers, and over-tired coders?" 8). I
- started combing the networks, and came across one such program, called
- the "SELCO Source Optimizer". It only had four optimizations, so I set
- to writing my own.
-
- Step one was to collect as many optimization ideas as I could. I posted
- to Usenet and got an impressive response, and the contributors are listed
- above. I promised a report on the optimizations received, and here it
- is. My aim now is to write a program to make these optimizations, and
- to distribute it. Contributors will receive a copy of the final archive,
- to thank them for their time and energy. Further contributions will be
- welcomed, so rather than making changes yourself, tell me what you want
- changed, and I'll distribute it with the next update.
-
-
- ---------------------------------------------------------------------------
- C H A N G E S
- ---------------------------------------------------------------------------
-
-
- 2nd Edition
-
- The second edition incorporated a hell of a lot of corrections. Duplicate
- copies of some optimizations were merged into a single entry, and
- a few additions were made. Sorry that the first edition was not sent
- out to all contributors, but I was a tad busy. 8)
-
-
- 3rd Edition
-
- Due to the distribution of the second edition document, many comments were
- received and a couple of the "optimizations" were found to be incorrect.
- Analysis of the mul/div optimizations ended in a few modifications for
- safety. They still save a huge number of clock cycles, so it is better to
- be safe than sorry.
-
- Also, I have made it so that the number of words of space saved or
- increased is shown. Space savings are positive, increases are negative.
- Zero means no change.
-
-
- 4th Edition
-
- Some minor changes and additions as well as the addition of columns for
- '030 and '040 CPUs - a whole new format was required...
-
-
- 5th Edition
-
- Erik Bakke released his docs on 020+ CPUs and 881/882 FPUs. I have been
- given permission to use these docs to further the capabilities of asp68k.
- Thanks Erik... I really would like to get hold of the 020,030,040
- Programmer Reference Cards or manuals, so if anyone has any copies they
- wanna send me, let me know... Local Motorola distributors are not too
- helpful.
-
-
- 6th Edition
-
- Aaron Digulla advised that it would be helpful if the optimizations were
- sorted somehow. I will sort by the first letters of the first line
- of the optimizations. Also a special thanks to Makoto Kamada for his
- detailed contributions, without which this text would have died long ago.
-
-
- 7th Edition (UNOFFICIAL, but it had to be done IMHO)
-
- · Added 68060 timings
- · Removed "RTD dx" and "ASL #n,az" like instructions, and some other
- useless optimizations.
- · Added some tips for 68030/68040/68060 coding
-
- As far as I know, 68020 instruction timings are very close to 68030
- instruction timings. Since I'm not 100% sure about that, I leave the
- 68020 column empty...
-
- ---------------------------------------------------------------------------
- O P T I M I Z A T I O N S
- ---------------------------------------------------------------------------
-
-
- Note:-
-
- m? = memory operand
- dx = data register
- ds = data register (scratch)
- ax = address register
- rx = either a data or address register
- #n = immediate operand
- ??,?1,?2= address label
- * = anything
- .x = any size
- b<cc> = branch commands
-
- Opt = optimization
- Notes = notes about where optimization is valid, and misc notes
- Speed = are clock periods saved? ("Y" = yes
- "y" = in some cases
- "N" = no
- "*" = increase
- "?" = not known
- "-" = cannot be used on this cpu
- "!" = must be used on this cpu)
- Size = how many bytes are saved?
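
As an illustration of how an entry in this notation could be applied
mechanically, here is a minimal Python sketch of one rule from the table
below (add.x #n,* -> addq.x #n,* for 1 <= n <= 8, and the subq.x
counterpart for -8 <= n <= -1). The function name rewrite_add_immediate
and the regular expression are assumptions made for this example and are
not part of asp68k; the sketch only handles decimal immediates and lines
without trailing comments.

```python
import re

# Hypothetical matcher for "add.x #n,*" with a decimal immediate.
# Groups: indentation, size letter, immediate value, destination operand.
ADD_IMM = re.compile(r"^(\s*)add\.([bwl])\s+#(-?\d+),(\S+)\s*$", re.IGNORECASE)


def rewrite_add_immediate(line):
    """Return the line with a small add-immediate replaced by addq/subq.

    Lines that do not match the rule, or whose immediate is outside
    -8..-1 and 1..8, are returned unchanged.
    """
    match = ADD_IMM.match(line)
    if match is None:
        return line
    indent, size, value, dest = match.groups()
    n = int(value)
    if 1 <= n <= 8:
        # add.x #n,*  ->  addq.x #n,*
        return f"{indent}addq.{size} #{n},{dest}"
    if -8 <= n <= -1:
        # add.x #n,*  ->  subq.x #-n,*
        return f"{indent}subq.{size} #{-n},{dest}"
    return line


if __name__ == "__main__":
    for source in ("    add.l #4,d0", "    add.w #-2,(a0)", "    add.l #100,d0"):
        print(rewrite_add_immediate(source))
```

A real optimizer would also have to parse hex and symbolic immediates and
check the per-CPU Speed columns before applying a rule; this sketch leaves
both out.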
-
- -----------------------------------------------------------------
- Opt                                           Speed          Size
-                                      000 010 020 030 040 060
- ------------------------------------+---+---+---+---+---+---+----
- * ??* -> * n(pc)* | Y | Y | ? | N | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- n = ??-pc, n < 32768
- ------------------------------------+---+---+---+---+---+---+----
- *0(ax)* -> *(ax)* | Y | Y | ? | y | y | y | 2
- ------------------------------------+---+---+---+---+---+---+----
- add*.x #0,dx -> tst.x dx | Y | Y | ? | Y | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- add.x #n,* -> addq.x #n,* | Y | Y | ? | Y | y | y | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- if 1 <= n <= 8
- ------------------------------------+---+---+---+---+---+---+----
- add.x #n,* -> subq.x #-n,* | Y | Y | ? | Y | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- -8 <= n <= -1
- ------------------------------------+---+---+---+---+---+---+----
- add.x #n,ax -> lea n(ax),ax | Y | Y | ? | y | * | N | 0/2
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= -9, 9 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- addq.l #n,ax -> addq.w #n,ax | Y | Y | ? | N | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- addq.l #n,ry -> add.l #(n+m),ry | Y | Y | ? | Y | * | * |-2
- addq.l #m,ry | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #2,ax -> move.w *,(ax) | Y | Y | ? | Y | Y | Y | 2
- move.w *,-(ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #4,ax -> move.l *,(ax) | Y | Y | ? | Y | Y | Y | 2
- move.l *,-(ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #6,ax -> move.w *1,4(ax) | Y | Y | ? | Y | y | y | 0
- move.w *1,-(ax) move.l *2,(ax) | | | | | | |
- move.l *2,-(ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- *1 and *2 do not contain ax
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #6,ax -> move.l *1,2(ax) | Y | Y | ? | Y | y | y | 0
- move.l *1,-(ax) move.w *2,(ax) | | | | | | |
- move.w *2,-(ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- *1 and *2 do not contain ax
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #8,ax -> move.l *1,4(ax) | Y | Y | ? | Y | y | y | 0
- move.l *1,-(ax) move.l *2,(ax) | | | | | | |
- move.l *2,-(ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- *1 and *2 do not contain ax
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #4,sp -> move.l ax,(sp) | Y | Y | ? | Y | Y | Y | 2
- pea (ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #6,sp -> move.w *,4(sp) | Y | Y | ? | Y | Y | Y | 0
- move.w *,-(sp) move.l ax,(sp) | | | | | | |
- pea (ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #6,sp -> move.l ax,2(sp) | Y | Y | ? | Y | Y | Y | 0
- pea (ax) move.w *,(sp) | | | | | | |
- move.w *,-(sp) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #8,sp -> move.l *,4(sp) | Y | Y | ? | Y | Y | Y | 0
- move.l *,-(sp) move.l ax,(sp) | | | | | | |
- pea (ax) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #8,sp -> move.l ax,4(sp) | Y | Y | ? | Y | Y | Y | 0
- pea (ax) move.l *,(sp) | | | | | | |
- move.l *,-(sp) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- addq.x #8,sp -> move.l ax,4(sp) | Y | Y | ? | Y | Y | Y | 0
- pea (ax) move.l ay,(sp) | | | | | | |
- pea (ay) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- .x is .w or .l
- ax,ay are not a7(=sp)
- ------------------------------------+---+---+---+---+---+---+----
- and.l #n,dx -> bclr.l #b,dx | Y | Y | ? | N | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- not(n) = 2^b (only 1 bit off)
- ------------------------------------+---+---+---+---+---+---+----
- asl.b #2,dy -> add.b dy,dy | Y | Y | ? | Y | Y | * |-2
- add.b dy,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- asl.b #n,dx -> clr.b dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=8
- ------------------------------------+---+---+---+---+---+---+----
- asl.l #16,dx -> swap dx | Y | Y | ? | N | N | * |-2
- clr.w dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- asl.l #n,dx -> asl.w #(n-16),dx | Y | Y | ? | * | * | * |-4
- swap dx | | | | | | |
- clr.w dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, 16<n<32
- ------------------------------------+---+---+---+---+---+---+----
- asl.l #n,dx -> moveq #0,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=32
- ------------------------------------+---+---+---+---+---+---+----
- asl.w #2,dy -> add.w dy,dy | Y | Y | ? | Y | Y | * |-2
- add.w dy,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- asl.w #n,dx -> clr.w dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=16
- ------------------------------------+---+---+---+---+---+---+----
- asl.x #1,dy -> add.x dy,dy | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- asr.b #n,dx -> clr.b dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
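- status flags are wrong, n>=8; valid only if dx is known to be
- non-negative (asr fills with the sign bit, so a negative dx gives -1)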
- ------------------------------------+---+---+---+---+---+---+----
- asr.l #16,dx -> swap dx | Y | Y | ? | * | * | * |-2
- ext.l dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- asr.l #n,dx -> moveq #0,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
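- status flags are wrong, n>=32; valid only if dx is known to be
- non-negative (asr fills with the sign bit, so a negative dx gives -1)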
- ------------------------------------+---+---+---+---+---+---+----
- asr.l #n,dx -> swap dx | Y | Y | ? | * | * | * |-4
- asr.w #(n-16),dx | | | | | | |
- ext.l dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, 16<n<32
- ------------------------------------+---+---+---+---+---+---+----
- asr.w #n,dx -> clr.w dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
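- status flags are wrong, n>=16; valid only if dx is known to be
- non-negative (asr fills with the sign bit, so a negative dx gives -1)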
- ------------------------------------+---+---+---+---+---+---+----
- b<cc>.w ?? -> b<cc>.s ?? | Y | Y | ? | Y | N | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- abs(??-pc)<128
- ------------------------------------+---+---+---+---+---+---+----
- bclr.l #n,dx -> and.w #m,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- 0 <= n <= 15, m = 65535-(2^n)
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bra ?? -> (nothing) | Y | Y | Y | Y | Y | Y | 2/4
- ?? ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- remove null branches, but keep the label
- ------------------------------------+---+---+---+---+---+---+----
- bset.b #7,m? -> tas m? | y | y | ? | * | * | * | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- m? must be address allowing read-modify-write transfer.
- Status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.b #7,m? -> tas m? | y | y | ? | * | * | * | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- m? must be address allowing read-modify-write transfer.
- Status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.b #7,m? -> tas m? | y | y | ? | * | * | * | 2
- ------------------------------------+---+---+---+---+---+---+----
- m? must be address allowing read-modify-write transfer.
- Status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.l #7,dx -> tas dx | Y | Y | ? | Y | y | N | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.l #7,dx -> tas dx | Y | Y | ? | Y | y | N | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.l #7,dx -> tas dx | Y | Y | ? | Y | y | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bset.l #n,dx -> or.w #m,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- 0 <= n <= 15, m = 2^n
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- bsr ?? -> bra ?? | Y | Y | ? | Y | Y | Y | 2
- rts | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- different stack depth
- ------------------------------------+---+---+---+---+---+---+----
- btst.b #7,m? -> tst.b m? | Y | Y | ? | Y | Y | y | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong. Not valid for Dn, d16(PC), d8(PC,Xn)
- dest address modes.
- ------------------------------------+---+---+---+---+---+---+----
- btst.b #7,m? -> tst.b m? | Y | Y | ? | Y | Y | y | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong. Not valid for Dn, d16(PC), d8(PC,Xn)
- dest address modes.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #7,dx -> tst.b dx | Y | Y | ? | Y | N | N | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #7,dx -> tst.b dx | Y | Y | ? | Y | N | N | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #15,dx -> tst.w dx | Y | Y | ? | Y | N | N | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #15,dx -> tst.w dx | Y | Y | ? | Y | N | N | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #31,dx -> tst.l dx | Y | Y | ? | Y | N | N | 2
- beq ?? bpl ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- Status flags are wrong.
- ------------------------------------+---+---+---+---+---+---+----
- btst.l #31,dx -> tst.l dx | Y | Y | ? | Y | N | N | 2
- bne ?? bmi ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- clr.b mn -> clr.w mn | Y | Y | ? | Y | Y | Y |2/4/6
- clr.b mn+1 | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mn must be even on 68000/68010 (a word access to an odd address
- causes an address error); best if mn is longword aligned
- ------------------------------------+---+---+---+---+---+---+----
- clr.l dx -> moveq #0,dx | Y | Y | ? | N | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- clr.w mn -> clr.l mn | Y | Y | ? | Y | Y | Y |2/4/6
- clr.w mn+2 | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mn must be even on 68000/68010 (a long access to an odd address
- causes an address error); best if mn is longword aligned
- ------------------------------------+---+---+---+---+---+---+----
- clr.x -(ax) -> move.x ds,-(ax) | Y | Y | ? | Y | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- ds must equal zero
- ------------------------------------+---+---+---+---+---+---+----
- clr.x n(ax,rx) -> move.x ds,n(ax,rx)| Y | Y | ? | Y | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- ds must equal zero
- ------------------------------------+---+---+---+---+---+---+----
- cmp.x #0,ax -> move.x ax,ds | Y | Y | ? | Y | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- move ax to scratch register
- ------------------------------------+---+---+---+---+---+---+----
- cmp.x #0,ax -> tst.x ax | - | - | ? | Y | N | N | ?
- ------------------------------------+---+---+---+---+---+---+----
- for .w and .l
- ------------------------------------+---+---+---+---+---+---+----
- cmp.x #0,dx -> tst.x dx | Y | Y | ? | N | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- cmp.x #0,m? -> tst.x m? | Y | Y | ? | Y | Y | Y | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- may not be legal on some early '000 CPUs
- ------------------------------------+---+---+---+---+---+---+----
- divu.l #n,dx -> lsr.l #m,dx | ! | ! | ? | ! | ! | ! | 4
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8
- ------------------------------------+---+---+---+---+---+---+----
- divu.l #n,dx -> moveq #0,dx | ! | ! | ? | ! | ! | ! | 4
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, m>=32
- ------------------------------------+---+---+---+---+---+---+----
- divu.l #n,dx -> moveq #m,ds | ! | ! | ? | ! | ! | ! | 2
- lsr.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8<m<32
- ------------------------------------+---+---+---+---+---+---+----
- divu.w #n,dx -> lsr.l #m,dx | Y | Y | ? | ! | ! | ! | 2
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8, ignore remainder
- ------------------------------------+---+---+---+---+---+---+----
- divu.w #n,dx -> moveq #0,dx | Y | Y | ? | ! | ! | ! | 2
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, m>=32
- ------------------------------------+---+---+---+---+---+---+----
- divu.w #n,dx -> moveq #m,ds | Y | Y | ? | ! | ! | ! | 0
- lsr.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8<m<32, ignore remainder
- ------------------------------------+---+---+---+---+---+---+----
- eor.x #-1,* -> not.x * | Y | Y | ? | Y | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- ext.w dx -> extb.l dx | - | - | ? | ? | Y | Y | 2
- ext.l dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- jmp ?? -> bra.w ?? | Y | Y | ? | y | Y | y | 2
- ------------------------------------+---+---+---+---+---+---+----
- abs(??-pc) < 32768, same section
- ------------------------------------+---+---+---+---+---+---+----
- jsr * -> jmp * | Y | Y | ? | Y | Y | Y | 2
- rts | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- different stack depth
- ------------------------------------+---+---+---+---+---+---+----
- jsr ?1 -> pea ?2 | y | y | ? | ? | Y | y | 0
- jmp ?2 jmp ?1 | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- same time if jsr is abs.l (68000/68010)
- ------------------------------------+---+---+---+---+---+---+----
- jsr ?? -> bsr.w ?? | Y | Y | ? | y | Y | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- abs(??-pc) < 32768, same section
- ------------------------------------+---+---+---+---+---+---+----
- lea (ax),ax -> (nothing) | Y | Y | Y | Y | Y | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- delete
- ------------------------------------+---+---+---+---+---+---+----
- lea 0.w,ax -> sub.l ax,ax | Y | Y | ? | Y | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- lea n(ax),ax -> addq.w #n,ax | Y | Y | ? | Y | Y | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- if 1 <= n <= 8
- ------------------------------------+---+---+---+---+---+---+----
- lea n(ax),ax -> subq.w #-n,ax | Y | Y | ? | Y | Y | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- if -8 <= n <= -1
- ------------------------------------+---+---+---+---+---+---+----
- lsl.b #2,dy -> add.b dy,dy | Y | Y | ? | Y | N | * |-2
- add.b dy,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- lsl.b #n,dx -> clr.b dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=8
- ------------------------------------+---+---+---+---+---+---+----
- lsl.l #16,dx -> swap dx | Y | Y | ? | N | * | * |-2
- clr.w dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- lsl.l #n,dx -> lsl.w #(n-16),dx | Y | Y | ? | * | * | * |-4
- swap dx | | | | | | |
- clr.w dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, 16<n<32
- ------------------------------------+---+---+---+---+---+---+----
- lsl.l #n,dx -> moveq #0,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=32
- ------------------------------------+---+---+---+---+---+---+----
- lsl.w #2,dy -> add.w dy,dy | Y | Y | ? | Y | N | * |-2
- add.w dy,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- lsl.w #n,dx -> clr.w dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=16
- ------------------------------------+---+---+---+---+---+---+----
- lsl.x #1,dy -> add.x dy,dy | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- lsr.b #n,dx -> clr.b dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=8
- ------------------------------------+---+---+---+---+---+---+----
- lsr.l #16,dx -> clr.w dx | Y | Y | ? | Y | N | * |-2
- swap dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- lsr.l #n,dx -> clr.w dx | Y | Y | ? | * | * | * |-4
- swap dx | | | | | | |
- lsr.w #(n-16),dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, 16<n<32
- ------------------------------------+---+---+---+---+---+---+----
- lsr.l #n,dx -> moveq #0,dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=32
- ------------------------------------+---+---+---+---+---+---+----
- lsr.w #n,dx -> clr.w dx | Y | Y | ? | Y | Y | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong, n>=16
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,(ax) -> st (ax) | Y | Y | ? | * | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,(ax)+ -> st (ax)+ | N | N | ? | * |Y/*| N | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,-(ax) -> st -(ax) | N | N | ? | * |Y/*| N | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,?? -> st ?? | Y | Y | ? | * |Y/*| Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,dx -> st dx | Y | Y | ? | N | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,n(ax) -> st n(ax) | Y | Y | ? | * |Y/N| Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #-1,n(ax,rx) -> st n(ax,rx) | Y | Y | ? | * | * | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- move.b #x,mn -> move.w #xy,mn | Y | Y | ? | Y | Y | Y |4/6/8
- move.b #y,mn+1 | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mn must be even on 68000/68010 (a word access to an odd address
- causes an address error); best if mn is longword aligned
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,-(sp) -> pea n.w | Y | Y | ? | N | N | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,ax -> move.w #n,ax | Y | Y | ? | Y | N | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #-128,dx | Y | Y | ? | Y | * | * | 2
- subq.l #-(n+128),dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- -136 <= n <= -129
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | Y | Y | ? | Y | * | * | 2
- not.b dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- 128 <= n <= 255, m = 255-n
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | Y | Y | ? | Y | * | * | 2
- not.w dx | | | | | | |
- | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- 65408 <= n <= 65535 with m = 65535-n, or
- -65536 <= n <= -65409 with m = -65537-n
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | Y | Y | ? | N | * | * | 2
- swap dx | | | | | | |
- | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- 65536 <= n <= 8323072 with n = m*65536, or
- -8323073 <= n <= -65537 with n = m*65536+65535
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #n,dx | Y | Y | ? | Y | N | N | 4
- ------------------------------------+---+---+---+---+---+---+----
- if -128 <= n <= 127
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #y,dx | * | * | ? | N | * | * | 2
- lsl.l #z,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n = y * 2^z
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | Y | Y | ? | Y | * | * | 2
- add.b dx,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- (128 <= n <= 254 or -256 <= n <= -130) and n is even, m = n/2
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | Y | Y | ? | * | * | * | 2
- bchg.l dx,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n = -32881 -> m = -113
- n = -32849 -> m = -81
- n = -32817 -> m = -49
- n = -32785 -> m = -17
- n = -16498 -> m = -114
- n = -16466 -> m = -82
- n = -16434 -> m = -50
- n = -16402 -> m = -18
- n = -8307 -> m = -115
- n = -8275 -> m = -83
- n = -8243 -> m = -51
- n = -8211 -> m = -19
- n = -4212 -> m = -116
- n = -4180 -> m = -84
- n = -4148 -> m = -52
- n = -4116 -> m = -20
- n = -2165 -> m = -117
- n = -2133 -> m = -85
- n = -2101 -> m = -53
- n = -2069 -> m = -21
- n = -1142 -> m = -118
- n = -1110 -> m = -86
- n = -1078 -> m = -54
- n = -1046 -> m = -22
- n = -631 -> m = -119
- n = -599 -> m = -87
- n = -567 -> m = -55
- n = -535 -> m = -23
- n = -376 -> m = -120
- n = -344 -> m = -88
- n = -312 -> m = -56
- n = -280 -> m = -24
- n = 264 -> m = 8
- n = 296 -> m = 40
- n = 328 -> m = 72
- n = 360 -> m = 104
- n = 521 -> m = 9
- n = 553 -> m = 41
- n = 585 -> m = 73
- n = 617 -> m = 105
- n = 1034 -> m = 10
- n = 1066 -> m = 42
- n = 1098 -> m = 74
- n = 1130 -> m = 106
- n = 2059 -> m = 11
- n = 2091 -> m = 43
- n = 2123 -> m = 75
- n = 2155 -> m = 107
- n = 4108 -> m = 12
- n = 4140 -> m = 44
- n = 4172 -> m = 76
- n = 4204 -> m = 108
- n = 8205 -> m = 13
- n = 8237 -> m = 45
- n = 8269 -> m = 77
- n = 8301 -> m = 109
- n = 16398 -> m = 14
- n = 16430 -> m = 46
- n = 16462 -> m = 78
- n = 16494 -> m = 110
- n = 32783 -> m = 15
- n = 32815 -> m = 47
- n = 32847 -> m = 79
- n = 32879 -> m = 111
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,dx -> moveq #m,dx | N | N | ? | * | * | * | 2
- bchg.l dx,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n = -2147483617 -> m = 31
- n = -2147483585 -> m = 63
- n = -2147483553 -> m = 95
- n = -2147483521 -> m = 127
- n = -1073741922 -> m = -98
- n = -1073741890 -> m = -66
- n = -1073741858 -> m = -34
- n = -1073741826 -> m = -2
- n = -536871011 -> m = -99
- n = -536870979 -> m = -67
- n = -536870947 -> m = -35
- n = -536870915 -> m = -3
- n = -268435556 -> m = -100
- n = -268435524 -> m = -68
- n = -268435492 -> m = -36
- n = -268435460 -> m = -4
- n = -134217829 -> m = -101
- n = -134217797 -> m = -69
- n = -134217765 -> m = -37
- n = -134217733 -> m = -5
- n = -67108966 -> m = -102
- n = -67108934 -> m = -70
- n = -67108902 -> m = -38
- n = -67108870 -> m = -6
- n = -33554535 -> m = -103
- n = -33554503 -> m = -71
- n = -33554471 -> m = -39
- n = -33554439 -> m = -7
- n = -16777320 -> m = -104
- n = -16777288 -> m = -72
- n = -16777256 -> m = -40
- n = -16777224 -> m = -8
- n = -8388713 -> m = -105
- n = -8388681 -> m = -73
- n = -8388649 -> m = -41
- n = -8388617 -> m = -9
- n = -4194410 -> m = -106
- n = -4194378 -> m = -74
- n = -4194346 -> m = -42
- n = -4194314 -> m = -10
- n = -2097259 -> m = -107
- n = -2097227 -> m = -75
- n = -2097195 -> m = -43
- n = -2097163 -> m = -11
- n = -1048684 -> m = -108
- n = -1048652 -> m = -76
- n = -1048620 -> m = -44
- n = -1048588 -> m = -12
- n = -524397 -> m = -109
- n = -524365 -> m = -77
- n = -524333 -> m = -45
- n = -524301 -> m = -13
- n = -262254 -> m = -110
- n = -262222 -> m = -78
- n = -262190 -> m = -46
- n = -262158 -> m = -14
- n = -131183 -> m = -111
- n = -131151 -> m = -79
- n = -131119 -> m = -47
- n = -131087 -> m = -15
- n = -65648 -> m = -112
- n = -65616 -> m = -80
- n = -65584 -> m = -48
- n = -65552 -> m = -16
- n = 65552 -> m = 16
- n = 65584 -> m = 48
- n = 65616 -> m = 80
- n = 65648 -> m = 112
- n = 131089 -> m = 17
- n = 131121 -> m = 49
- n = 131153 -> m = 81
- n = 131185 -> m = 113
- n = 262162 -> m = 18
- n = 262194 -> m = 50
- n = 262226 -> m = 82
- n = 262258 -> m = 114
- n = 524307 -> m = 19
- n = 524339 -> m = 51
- n = 524371 -> m = 83
- n = 524403 -> m = 115
- n = 1048596 -> m = 20
- n = 1048628 -> m = 52
- n = 1048660 -> m = 84
- n = 1048692 -> m = 116
- n = 2097173 -> m = 21
- n = 2097205 -> m = 53
- n = 2097237 -> m = 85
- n = 2097269 -> m = 117
- n = 4194326 -> m = 22
- n = 4194358 -> m = 54
- n = 4194390 -> m = 86
- n = 4194422 -> m = 118
- n = 8388631 -> m = 23
- n = 8388663 -> m = 55
- n = 8388695 -> m = 87
- n = 8388727 -> m = 119
- n = 16777240 -> m = 24
- n = 16777272 -> m = 56
- n = 16777304 -> m = 88
- n = 16777336 -> m = 120
- n = 33554457 -> m = 25
- n = 33554489 -> m = 57
- n = 33554521 -> m = 89
- n = 33554553 -> m = 121
- n = 67108890 -> m = 26
- n = 67108922 -> m = 58
- n = 67108954 -> m = 90
- n = 67108986 -> m = 122
- n = 134217755 -> m = 27
- n = 134217787 -> m = 59
- n = 134217819 -> m = 91
- n = 134217851 -> m = 123
- n = 268435484 -> m = 28
- n = 268435516 -> m = 60
- n = 268435548 -> m = 92
- n = 268435580 -> m = 124
- n = 536870941 -> m = 29
- n = 536870973 -> m = 61
- n = 536871005 -> m = 93
- n = 536871037 -> m = 125
- n = 1073741854 -> m = 30
- n = 1073741886 -> m = 62
- n = 1073741918 -> m = 94
- n = 1073741950 -> m = 126
- n = 2147483551 -> m = -97
- n = 2147483583 -> m = -65
- n = 2147483615 -> m = -33
- n = 2147483647 -> m = -1
- ------------------------------------+---+---+---+---+---+---+----
- move.l #n,m? -> moveq #n,ds | Y | Y | ? | Y |N/*| y | 2
- move.l ds,m? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- -128 <= n <= 127
- ------------------------------------+---+---+---+---+---+---+----
- move.l (ax),ay -> move.x ([ax],n),dz| - | - | ? | * | * | * | 0
- move.x n(ay),dz | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l (ax),ay -> move.x ([ax]),dz | - | - | ? | * | * | * | 0
- move.x (ay),dz | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l (bd.x,ax),dy -> | - | - | ? | Y | Y | Y | 2
- move.l bd.x,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l (n.w,ax),dy -> | N | N | N | N | N | N | 0
- move.l n(ax),dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l (sp),(n,sp) -> rtd #n | - | - | ? | Y | Y | Y | 6
- lea (n,sp),sp | | | | | | |
- rts | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l 12(ax),12(ay) -> move16 | - | - | - | - | y | ? | 22
- move.l 8(ax),8(ay) (ax)+,(ay)+ | | | | | | |
- move.l 4(ax),4(ay) | | | | | | |
- move.l (ax)+,(ay)+ | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l ax,-(sp) -> link ax,#n | Y | Y | ? | Y | N | Y | 4
- move.l sp,ax | | | | | | |
- add.w #n,sp | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- move.l ax,-(sp) -> pea -n(ax) | Y | Y | ? | Y | Y | N | 0/4
- sub*.l #n,(sp) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l ax,-(sp) -> pea n(ax) | Y | Y | ? | Y | Y | N | 0/4
- add*.l #n,(sp) | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.l ax,sp -> unlk ax | Y | Y | ? | N | y | N | 2
- move.l (sp)+,ax | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- move.w #x,mn -> move.l #xy,mn | Y | Y | ? | Y | Y | Y |2/4/6
- move.w #y,mn+2 | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- best if mn is longword aligned
- ------------------------------------+---+---+---+---+---+---+----
- move.x #0,ax -> sub.l ax,ax | Y | Y | ? | Y | * | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- move.x #n,ax -> lea n,ax | Y | Y | ? | N | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- n <> 0
- ------------------------------------+---+---+---+---+---+---+----
- move.x ax,ay -> lea n(ax),ay | Y | Y | ? | Y | Y | Y | 2/4
- add.x #n,ay | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- move.x ax,az -> lea -n(ax,dx),az | Y | Y | ? | Y | Y | Y | 2
- sub.x #n,az | | | | | | |
- add.x dx,az | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- az=ax+dx-n, n<=32767 (68000/010: 8-bit displacement only, so -127<=n<=128)
- ------------------------------------+---+---+---+---+---+---+----
- move.x ax,az -> lea n(ax,dx),az | Y | Y | ? | Y | Y | Y | 2
- add.x #n,az | | | | | | |
- add.x dx,az | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- az=n+ax+dx, n<=32767 (68000/010: 8-bit displacement only, so -128<=n<=127)
- ------------------------------------+---+---+---+---+---+---+----
- movem.l (ax)+,registers | * | * | ? | ? | Y | N | *
- -> move.l (ax)+,ry | | | | | | |
- for each reg | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- movem.x *,@ -> move.x *,@ | Y | Y | ? | Y | Y | N | 2
- | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- @ = a single register, not (@=dx & .x=.w)
- ------------------------------------+---+---+---+---+---+---+----
- movem.x @,* -> move.x @,* | Y | Y | ? | Y | Y | N | 2
- | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- @ = a single register, status flags are wrong
- ------------------------------------+---+---+---+---+---+---+----
- moveq #n,az -> lea n(ax,ay.l*2),az | - | - | ? | Y | Y | Y | 4
- add.x ay,az | | | | | | |
- add.x ax,az | | | | | | |
- add.x ay,az | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- az=n+ax+2*ay, -128<=n<=127
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #1,dx -> (nothing) | ! | ! | ! | ! | ! | Y | 6
- ------------------------------------+---+---+---+---+---+---+----
- delete
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #10,dx -> add.l dx,dx | ! | ! | ? | Y | Y | * |-2
- move.l dx,ds | | | | | | |
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #12,dx -> asl.l #2,dx | ! | ! | ? | Y | Y | * |-2
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #2,dx -> add.l dx,dx | ! | ! | ? | ! | ! | Y | 4
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #3,dx -> move.l dx,ds | ! | ! | ? | ! | ! | * | 0
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #5,dx -> move.l dx,ds | ! | ! | ? | ! | ! | * | 0
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #6,dx -> add.l dx,dx | ! | ! | ? | ! | ! | * |-2
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #7,dx -> move.l dx,ds | ! | ! | ? | ! | ! | * | 0
- asl.l #3,dx | | | | | | |
- sub.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #9,dx -> move.l dx,ds | ! | ! | ? | ! | ! | * | 0
- asl.l #3,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mul*.l #n,dx -> moveq #m,ds | ! | ! | ? | ! | ! | N | 2
- asl.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8<m<14
- ------------------------------------+---+---+---+---+---+---+----
- muls.l #0,dx -> moveq #0,dx | ! | ! | ? | ! | ! | Y | 4
- ------------------------------------+---+---+---+---+---+---+----
- muls.l #n,dx -> asl.l #m,dx | ! | ! | ? | ! | ! | Y | 4
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #0,dx -> moveq #0,dx | Y | Y | ? | ! | ! | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #1,dx -> ext.l dx | Y | Y | ? | ! | ! | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #10,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-6
- add.l dx,dx | | | | | | |
- move.l dx,ds | | | | | | |
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #11,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-8
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l dx,ds | | | | | | |
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #12,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-6
- asl.l #2,dx | | | | | | |
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #2,dx -> ext.l dx | Y | Y | ? | ! | ! | N | 0
- add.l dx,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #3,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-4
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #5,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-4
- move.l dx,ds | | | | | | |
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #6,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-6
- add.l dx,dx | | | | | | |
- move.l dx,ds | | | | | | |
- add.l ds,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #7,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-4
- move.l dx,ds | | | | | | |
- asl.l #3,dx | | | | | | |
- sub.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #9,dx -> ext.l dx | Y | Y | ? | ! | ! | * |-4
- move.l dx,ds | | | | | | |
- asl.l #3,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #n,dx -> ext.l dx | Y | Y | ? | ! | ! | N | 0
- asl.l #m,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #n,dx -> moveq #m,ds | Y | Y | ? | ! | ! | * |-2
- ext.l dx | | | | | | |
- asl.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8<m<14
- ------------------------------------+---+---+---+---+---+---+----
- muls.w #n,dx -> swap dx | Y | Y | ? | ! | ! | * |-2
- clr.w dx | | | | | | |
- asr.l #(16-m),dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8 <= m <= 15
- ------------------------------------+---+---+---+---+---+---+----
- mulu.l #0,dx -> moveq #0,dx | ! | ! | ? | ! | ! | Y | 4
- ------------------------------------+---+---+---+---+---+---+----
- mulu.l #n,dx -> lsl.l #m,dx | ! | ! | ? | ! | ! | Y | 4
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #0,dx -> moveq #0,dx | Y | Y | ? | ! | ! | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #1,dx -> swap dx | Y | Y | ? | ! | ! | * |-2
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #12,dx -> swap dx | Y | Y | ? | ! | Y | * |-10
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- asl.l #2,dx | | | | | | |
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #2,dx -> swap dx | Y | Y | ? | ! | ! | * |-4
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- add.l dx,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #3,dx -> swap dx | Y | Y | ? | ! | ! | * |-8
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- move.l dx,ds | | | | | | |
- add.l dx,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #5,dx -> swap dx | Y | Y | ? | ! | ! | * |-8
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- move.l dx,ds | | | | | | |
- asl.l #2,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #6,dx -> swap dx | Y | Y | ? | ! | ! | * |-10
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- add.l dx,dx | | | | | | |
- move.l dx,ds | | | | | | |
- add.l ds,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #7,dx -> swap dx | Y | Y | ? | ! | ! | * |-8
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- move.l dx,ds | | | | | | |
- asl.l #3,dx | | | | | | |
- sub.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #9,dx -> swap dx | Y | Y | ? | ! | ! | * |-8
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- move.l dx,ds | | | | | | |
- asl.l #3,dx | | | | | | |
- add.l ds,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #n,dx -> swap dx | Y | Y | ? | ! | ! | * |-4
- clr.w dx | | | | | | |
- swap dx | | | | | | |
- lsl.l #m,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 1 <= m <= 8
- ------------------------------------+---+---+---+---+---+---+----
- mulu.w #n,dx -> swap dx | Y | Y | ? | ! | ! | * |-2
- clr.w dx | | | | | | |
- lsr.l #(16-m),dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 8 <= m <= 15
- ------------------------------------+---+---+---+---+---+---+----
- neg.x dx -> add.x dx,dy | Y | Y | Y | Y | Y | Y | 2
- sub.x dx,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- only if dx is not needed afterwards (dx is no longer negated)
- ------------------------------------+---+---+---+---+---+---+----
- neg.x dx -> eor.x #n-1,dx | Y | Y | ? | Y | Y | Y | 2
- add.x #n-1,dx | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- n is 2^m, 0 <= dx < n
- ------------------------------------+---+---+---+---+---+---+----
- neg.x dx -> sub.x dx,dy | Y | Y | Y | Y | Y | Y | 2
- add.x dx,dy | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- only if dx is not needed afterwards (dx is no longer negated)
- ------------------------------------+---+---+---+---+---+---+----
- nop -> (nothing) | Y | Y | ? | Y | Y | Y | 2
- ------------------------------------+---+---+---+---+---+---+----
- remove nops
- ------------------------------------+---+---+---+---+---+---+----
- or.l #n,dx -> bset.l #b,dx | Y | Y | ? | Y | * | N | 2
- ------------------------------------+---+---+---+---+---+---+----
- n = 2^b (only 1 bit set)
- ------------------------------------+---+---+---+---+---+---+----
- sub*.x #0,dx -> tst.x dx | Y | Y | ? | Y | N | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- sub.x #n,* -> addq.x #-n,* | Y | Y | ? | Y | y | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- -8 <= n <= -1
- ------------------------------------+---+---+---+---+---+---+----
- sub.x #n,* -> subq.x #n,* | Y | Y | ? | Y | y | N | 2/4
- ------------------------------------+---+---+---+---+---+---+----
- if 1 <= n <= 8
- ------------------------------------+---+---+---+---+---+---+----
- sub.x #n,ax -> lea -n(ax),ax | Y | Y | ? | Y | y | N | 0/2
- ------------------------------------+---+---+---+---+---+---+----
- -32767 <= n <= -9, 9 <= n <= 32767
- ------------------------------------+---+---+---+---+---+---+----
- subq.l #n,ax -> subq.w #n,ax | Y | Y | ? | N | N | N | 0
- ------------------------------------+---+---+---+---+---+---+----
- subq.w #1,dx -> db<cc> dx,?? | y | y | ? |y/*|N/*|y/*|-2
- b<cc> ?? b<cc> ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- slower when dx=0
- ------------------------------------+---+---+---+---+---+---+----
- subq.w #1,dx -> dbf dx,?? | Y | Y | ? |y/*|N/*|y/*|-2
- bra ?? bra ?? | | | | | | |
- ------------------------------------+---+---+---+---+---+---+----
- slower when dx=0
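The two moveq constant tricks in the table above (add.b dx,dx and bchg.l dx,dx) can be checked with a small model. The Python sketch below is not part of the original list. The helper names (moveq, bchg_self, add_b_self, bchg_table, add_b_table) are made up for illustration, and it assumes that bchg on a data register takes the bit number modulo 32. It only regenerates the n -> m values; it says nothing about timing.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a signed longword."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def moveq(m):
    """moveq #m,dx: sign-extend an 8-bit immediate to 32 bits."""
    if not -128 <= m <= 127:
        raise ValueError("moveq immediate out of range")
    return m & MASK32


def bchg_self(dx):
    """bchg.l dx,dx: toggle bit (dx mod 32) of dx (assumed behaviour)."""
    return dx ^ (1 << (dx & 31))


def add_b_self(dx):
    """add.b dx,dx: double the low byte, leave the upper 24 bits alone."""
    return (dx & 0xFFFFFF00) | ((dx + dx) & 0xFF)


def bchg_table(bit_lo, bit_hi):
    """Map n -> m for every m whose toggled bit lies in bit_lo..bit_hi."""
    table = {}
    for m in range(-128, 128):
        if bit_lo <= (m & 31) <= bit_hi:
            table[to_signed32(bchg_self(moveq(m)))] = m
    return table


def add_b_table():
    """Map n -> m for every n that moveq alone cannot load."""
    table = {}
    for m in range(-128, 128):
        n = to_signed32(add_b_self(moveq(m)))
        if not -128 <= n <= 127:
            table[n] = m
    return table
```

bchg_table(8, 15) gives the 64 values of the first bchg list and bchg_table(16, 31) the 128 values of the second; add_b_table() gives the even n covered by the add.b rule.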
-
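The shift-and-add replacements for the mul*.l rows in the table above can be sanity-checked with a small model. This is a rough Python sketch, not something from the original list: run() and SEQUENCES are hypothetical names, only the long-operand rows for 3, 5, 6, 7, 9, 10 and 12 are modelled, results are taken modulo 2^32, and condition codes are ignored (the replacement sequences do not leave the same flags as mul*).

```python
MASK32 = 0xFFFFFFFF


def run(sequence, x):
    """Execute (op, source, destination) steps on registers dx and ds.

    A source is either a register name or an immediate integer.
    Returns the final 32-bit content of dx.
    """
    regs = {"dx": x & MASK32, "ds": 0}
    for op, src, dst in sequence:
        value = regs[src] if isinstance(src, str) else src
        if op == "move":
            regs[dst] = value
        elif op == "add":
            regs[dst] = (regs[dst] + value) & MASK32
        elif op == "sub":
            regs[dst] = (regs[dst] - value) & MASK32
        elif op == "asl":
            regs[dst] = (regs[dst] << value) & MASK32
        else:
            raise ValueError("unknown operation: " + op)
    return regs["dx"]


# Replacement sequences as listed in the table, keyed by the multiplier.
SEQUENCES = {
    3: [("move", "dx", "ds"), ("add", "dx", "dx"), ("add", "ds", "dx")],
    5: [("move", "dx", "ds"), ("asl", 2, "dx"), ("add", "ds", "dx")],
    6: [("add", "dx", "dx"), ("move", "dx", "ds"),
        ("add", "dx", "dx"), ("add", "ds", "dx")],
    7: [("move", "dx", "ds"), ("asl", 3, "dx"), ("sub", "ds", "dx")],
    9: [("move", "dx", "ds"), ("asl", 3, "dx"), ("add", "ds", "dx")],
    10: [("add", "dx", "dx"), ("move", "dx", "ds"),
         ("asl", 2, "dx"), ("add", "ds", "dx")],
    12: [("asl", 2, "dx"), ("move", "dx", "ds"),
         ("add", "dx", "dx"), ("add", "ds", "dx")],
}


def check(factor, samples=(0, 1, 2, 1000, 0x7FFF, -1, -32768)):
    """True if the sequence for factor matches x * factor modulo 2^32."""
    return all(run(SEQUENCES[factor], x) == (x * factor) & MASK32
               for x in samples)
```

Calling check(f) for each f in SEQUENCES is a quick way to catch a wrong shift count before such a rule goes into an optimizer.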
- ---------------------------------------------------------------------------
- H I N T S & T I P S
- ---------------------------------------------------------------------------
-
-
- This new section collects hints that do not fit in the tables above,
- such as pipelining optimizations.
-
- 020+ Sequential memory accesses can cause pipeline stalls, so try to
- rearrange code so that memory accesses do not immediately follow each
- other. The same problem occurs if an address register updated
- by one instruction is used by the next one.
-
- ALL Write small routines as macros: inlined code is much faster,
- and in extreme cases smaller.
-
- ALL If a subroutine is called from only one place, either move
- it inline, or enter and leave it with jmp/bra only.
-
- # 7th edition :
-
- The 020+ hint above should apply only to the 68020. Because the 68030, 68040
- and 68060 have a data cache, sequential memory accesses speed things up.
-
- ALL Keep all data aligned on its natural boundary: word aligned
- for words, longword aligned for longwords.
-
- ALL Keep branch target addresses on a multiple of 8.
-
- 040+ Don't use FPU registers for temporary values: you should expect
- rounding problems, and fmove dx,fpx/fmove fpx,dx are slower than
- move dx,mem/move mem,dx when mem is cached.
-
- 060 Use mulu.x and muls.x when the factor is not a power of 2.
- They take only 2 cycles!
-
- 030+ Use LSL instead of ASL when possible.
-
- 020+ A scale factor in the indexed addressing modes does not add any
- time penalty.
-
- 040 Don't use PC indirect addressing modes.
-
- 040+ Avoid using the NOP instruction for timing purposes. It takes 8 cycles
- (040) or 9 cycles (060). On the 68040, this instruction also interlocks the
- effective address calculation and execute stages and synchronizes some
- portions of the processor before execution. It may have the same
- side effect on the 68060 (I'm not sure about that).
-
- 040+ Only use move16 on large blocks. It's not really faster than 4
- successive moves, but it bypasses the data cache¹ (this avoids multiple
- reads/writes of cache lines, and keeps the cache valid for further
- local/global data accesses). Data blocks should be aligned on a
- 16-byte boundary.
- ¹ move16 still reads/writes data that is already cached.
-
- 030 B<cc>.s takes less time when the branch is not taken (4 cycles). All
- other cases take 6 cycles.
-
- 040 Branches taken need 2 cycles. Branches not taken need 3 cycles.
-
- 020+ Avoid bit-field instructions, although some of them may be faster
- in some rare situations.
-
- 060 Avoid using the same register(s) in two consecutive instructions. This
- may prevent the second instruction from being dispatched to the second
- pipeline. Only most of the arithmetic/logical instructions, and
- moves from/to registers, can be dispatched there.
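
Several of the hints above come down to rounding an address up to a boundary (2 or 4 bytes for data, 8 for branch targets, 16 for move16 blocks). A minimal Python sketch of that arithmetic follows; align_up and is_aligned are hypothetical helper names, and in real source you would normally use whatever alignment directive your assembler provides.

```python
def align_up(address, boundary):
    """Round address up to the next multiple of boundary (a power of two)."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a power of two")
    return (address + boundary - 1) & ~(boundary - 1)


def is_aligned(address, boundary):
    """True if address is a multiple of boundary (a power of two)."""
    return (address & (boundary - 1)) == 0


def padding(address, boundary):
    """Number of filler bytes needed before an aligned item can start."""
    return align_up(address, boundary) - address
```

For example, a block that would start at $1001 has to be moved to $1010 before move16 can be used on it, which costs 15 bytes of padding.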
-
-
- 030+ Instruction and data cache tips.
- -------------------------------------
-
- This is a short explanation of how the caches work. It may help you to
- optimize your code to take advantage of them.
-
- These processors always fill their caches on a line basis: they load a
- 16-byte memory block, aligned on a 16-byte boundary, for each
- cache read/write operation. Each cache holds 4 KB (68040) or 8 KB
- (68060).
- When a cache line read is initiated, the first memory cycle attempts to load
- the line entry corresponding to the instruction half-line (8 bytes) or data
- item requested by the integer unit. Subsequent transfers fetch the
- remaining entries of the line.
- WriteThrough and CopyBack cache modes work differently when a write occurs.
- In WriteThrough mode, the integer unit updates both the cache entry and the
- memory. A write miss never causes a cache line fill.
- In CopyBack mode, the integer unit updates only the cache entry. The memory
- is updated when the line is replaced by another line. A write miss
- causes an entire line read, then the corresponding cache entry is updated.
-
- The 68030 data cache in burst mode works as stated above. If the cache burst
- mode is disabled, the 68030 reads/writes only partial cache lines.
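
To make the difference between the two write modes concrete, here is a toy Python model. It is only a sketch under simplifying assumptions: a direct-mapped cache with a handful of 16-byte lines (the 68040/68060 caches are set-associative and far larger), no timing at all, and invented names (TinyCache, line_fills, memory_writes, line_writebacks). It just counts bus traffic according to the rules described above.

```python
LINE_SIZE = 16  # bytes per cache line, as described above


class TinyCache:
    """Direct-mapped toy cache that only counts bus traffic."""

    def __init__(self, num_lines, copyback):
        self.num_lines = num_lines
        self.copyback = copyback
        self.tags = [None] * num_lines    # line number held in each slot
        self.dirty = [False] * num_lines  # CopyBack only: line was modified
        self.line_fills = 0               # whole lines read from memory
        self.memory_writes = 0            # single writes that reach memory
        self.line_writebacks = 0          # dirty lines written on replacement

    def _locate(self, address):
        line = address // LINE_SIZE
        return line, line % self.num_lines

    def _fill(self, line, slot):
        # Replacing a dirty line is the moment memory finally gets updated.
        if self.dirty[slot]:
            self.line_writebacks += 1
        self.tags[slot] = line
        self.dirty[slot] = False
        self.line_fills += 1

    def read(self, address):
        line, slot = self._locate(address)
        if self.tags[slot] != line:
            self._fill(line, slot)

    def write(self, address):
        line, slot = self._locate(address)
        if self.copyback:
            if self.tags[slot] != line:
                self._fill(line, slot)  # write miss reads the whole line
            self.dirty[slot] = True     # memory is updated on replacement
        else:
            # WriteThrough: memory is always updated, a miss allocates nothing.
            self.memory_writes += 1
```

With this model, one write to each of eight different lines causes eight line fills in CopyBack mode and none in WriteThrough mode, while four writes inside one line cost a single fill in CopyBack mode.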
-
-
- By now, we can see that (assuming cache burst mode for the 68030):
- - After a data read operation, subsequent accesses within the same data cache
- line are faster. The same is true after a write access in CopyBack mode.
- These subsequent accesses should however be deferred, to be sure that the
- entire cache line is ready.
- - Sporadic writes should go to a WriteThrough memory page to avoid
- unneeded cache fills. WriteThrough mode can be enabled on 040/060 based
- machines: you have to modify the MMU page descriptors to do so.
- - Keep your local data together, so that fewer cache lines are needed to
- cache it. For example, if you read/write the same 4 longwords in a loop,
- it's better to have them in the same cache line.
- - Keep your functions together too (040/060). Since the MMU is used to select
- the cache mode on a page basis, this may avoid unneeded MMU page descriptor
- fetches. These MMU pages are most often 4 KB in size, aligned on a 4 KB boundary.
- - Linear code may be worse than a loop, due to instruction cache line
- read timings. As a side note, it may be better to use small slow
- instructions than large quicker ones. The 68060 can execute an entire
- instruction cache line in less than 4 cycles, which is not enough to load
- the next cache entry: the instruction unit will have to wait for the next
- instruction!
- - Don't flush the caches all the time (5394 cycles for the whole data
- cache on the 68060, with no wait states).
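
The "keep your local data together" advice can be quantified by counting how many 16-byte lines a set of accesses touches. A small Python sketch, with an assumed helper name lines_touched and made-up addresses:

```python
LINE_SIZE = 16  # bytes per cache line


def lines_touched(addresses, size=4):
    """Count the distinct cache lines covered by accesses of size bytes."""
    lines = set()
    for address in addresses:
        first = address // LINE_SIZE
        last = (address + size - 1) // LINE_SIZE
        lines.update(range(first, last + 1))
    return len(lines)


# Four longwords packed into one aligned 16-byte block.
packed = [0x1000, 0x1004, 0x1008, 0x100C]
# The same four longwords scattered through memory.
scattered = [0x1000, 0x1040, 0x1080, 0x10C0]
# Packed, but starting in the middle of a line.
misplaced = [0x1008, 0x100C, 0x1010, 0x1014]
```

Here the packed layout needs one line, the misplaced one two and the scattered one four, so a loop over the scattered data ties up four times as many cache lines.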
-
-
-
- ---------------------------------------------------------------------------
- C O N C L U S I O N
- ---------------------------------------------------------------------------
-
-
- These are the optimizations I've come up with so far. If you could check
- what I've done, and report any errors, that would make this list better. I
- only have so much time to spend on this, and many hands make light work.
- Also, stats (and more optimizations) for 68020+ CPUs would be welcomed.
- Currently this list is only for simple peephole optimization stuff, but I
- will hopefully get around to more extensive optimizations. Pipeline
- optimization is on the way, so look out. Any info on the 68020+ pipelines
- would be appreciated.
-
- For optimizations with question marks (?) in the boxes next to them, I do
- not have the data to check yet.
-
- The latest version of the asp68k archive is available by anonymous ftp from
- ftp.mq.edu.au in the /home/mglew/ directory or by calling Technophilia BBS
- on +61 2 807 3563 (or (02) 807 3563 in Australia).
-
-
- ===========================================================================
- EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF EOF
- ===========================================================================
-